ModelCC is a model-based parser generator that decouples language design from language
processing. ModelCC provides two different mechanisms to specify the mapping from an abstract
syntax model to a concrete syntax model: metadata annotations defined on top of the abstract 
syntax model specification and a domain-specific language for defining ASM-CSM mappings. Us- 
ing a domain-specific language to specify the mapping from abstract to concrete syntax models 
allows the definition of multiple concrete syntax models for the same abstract syntax model. In 
this paper, we describe the ModelCC domain-specific language for abstract syntax model to con- 
crete syntax model mappings and we showcase its capabilities by providing a meta-definition of 
that domain-specific language. 



I. INTRODUCTION 

Model-based language specification techniques [3] decou- 
ple language design from language processing and auto- 
matically generate the corresponding language grammar, 
thus making the language design process less arduous. 

ModelCC is a model-based parser generator that
allows the specification of the abstract syntax of a
language as a set of classes, which represent language
elements, and relationships between those classes or
language elements.

ModelCC allows mapping the abstract syntax model
to concrete syntax models by imposing constraints over
language elements and their relationships, using either
metadata annotations or a domain-specific language for
the specification of language constraints (i.e., a sort of
metalanguage).

In this paper, we propose the ModelCC domain- 
specific language for abstract syntax model to concrete 
syntax model mappings (from now on referred to as the
ModelCC DSL for ASM-CSM mappings) and present 
its specification in a model-based way using ModelCC. 
This domain-specific language ultimately allows model- 
based parser generators to decouple abstract syntax mod- 
els from concrete syntax models. 

Section II introduces model-based language specification
and the ModelCC model-based parser generator.
Section III describes the ModelCC domain-specific
language for ASM-CSM mappings. Finally, Section IV
presents our conclusions and future work.



II. MODEL-BASED LANGUAGE SPECIFICATION 

Most existing language specification techniques [1]
require the language designer to provide a textual
specification of the language grammar. The proper
specification of such a grammar is a nontrivial process that
depends on the lexical and syntax analysis techniques 
to be used, since each kind of technique requires the
grammar to comply with a specific set of constraints.
Each analysis technique is characterized by its expressive
power, and this expressive power determines whether
a given analysis technique is suitable for a particular lan- 
guage. The most significant constraints on formal lan- 
guage specification originate from the need to consider 
context-sensitivity, the need to perform an efficient anal- 
ysis, and some techniques' inability to resolve conflicts 
caused by grammar ambiguities. 

In practice, when we want to build a complex data 
structure from an input codified using a specific syntax, 
the implementation of the mandatory language processor 
requires the software engineer to build a grammar-based 
language specification for the input data and also to im- 
plement the conversion from the parse tree returned by 
the parser to the desired data structure, which is an in- 
stance of the data model that describes the input data. 

Whenever the language specification has to be mod- 
ified, the language designer has to manually propagate 
changes throughout the entire language processor tool 
chain, from the specification of the grammar defining 
the formal language (and its adaptation to specific pars- 
ing tools) to the corresponding data model. These up- 
dates are time-consuming, tedious, and error-prone. By 
making such changes labor-intensive, the traditional lan- 
guage processing approach hampers the maintainability 
and evolution of the language used to represent the data.

Moreover, it is not uncommon for different applica- 
tions to use the same language. For example, the com- 
piler, different code generators, and other tools within an 
IDE, such as the editor or the debugger, typically need to 
grapple with the full syntax of a programming language. 
Unfortunately, their maintenance typically requires keep- 
ing several copies of the same language specification in 
sync. 

The idea behind model-based language specification is 
that, starting from a single abstract syntax model (ASM) 
that represents the core concepts in a language, language 
designers can develop one or several concrete syntax
models (CSMs). These CSMs can suit the specific needs of the
desired textual or graphical representation. The ASM-CSM
mappings can be performed, for instance, by annotating
the abstract syntax model with the constraints
needed to transform the elements in the abstract syntax
into their concrete representation.

This way, the ASM representing the language can be
modified as needed without having to worry about the
language processor and the peculiarities of the chosen
parsing technique, since the corresponding language
processor will be automatically updated. In this case, the
language designer does not have to manually propagate
changes throughout the language processor tool chain.
Also, when different applications use the same language,
there is no need to keep or maintain duplicate language
models.

Finally, as the ASM is not bound to a particular
parsing technique, evaluating alternative and/or
complementary parsing techniques is possible without
having to propagate their constraints into the language
model. Therefore, by using an ASM, model-based
language specification completely decouples language
specification from language processing, which can be
performed using whichever parsing techniques are suitable
for the formal language implicitly defined by the abstract
model and its concrete mapping.

A diagram summarizing the traditional language design
process is shown in Figure 1, whereas the corresponding
diagram for the model-based approach is shown
in Figure 2.

Figure 1: Traditional language processing.

Figure 2: Model-based language processing.

It should be noted that ASMs may represent non-tree
structures. Hence the use of the 'abstract syntax graph'
term in Figure 2. When the ASM represents a tree-like
structure, a model-based parser generator is equivalent to
a traditional grammar-based parser generator in terms
of expressive power. When the ASM represents non-tree
structures, reference resolution techniques can be
employed to make model-based parser generators more
powerful than grammar-based ones.

ModelCC is a parser generator that supports a
model-based approach to the design of language processing
systems. Its starting ASM is created by defining
classes that represent language elements and establishing
relationships among those elements. Once the ASM
is established, constraints can be imposed over language
elements and their relationships as annotations in order
to produce the desired ASM-CSM mappings.

The ASM is built on top of basic language elements,
which can be viewed as the tokens in the model-driven
specification of a language. ModelCC provides the
necessary mechanisms to combine those basic elements into
more complex language constructs, which correspond to
the use of concatenation, selection, and repetition in the
syntax-driven specification of languages.
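
As an illustration of how concatenation, selection, and repetition can surface in a class-based ASM, the following minimal Python sketch models a tiny arithmetic language. It does not use the ModelCC API, and every class and function name in it is hypothetical. Several members in one class play the role of concatenation, subclassing plays the role of selection, and list-valued members play the role of repetition.

```python
from dataclasses import dataclass
from typing import List


class Expression:
    """Abstract language element; its subclasses are the alternatives (selection)."""


@dataclass
class IntegerLiteral(Expression):
    value: int  # basic language element, comparable to a token


@dataclass
class Sum(Expression):
    operands: List[Expression]  # list-valued member: repetition


@dataclass
class Assignment:
    # Several members in the same element: concatenation.
    target: str
    value: Expression


def alternatives(element: type) -> List[str]:
    """Return the names of the direct subclasses, i.e. the selection alternatives."""
    return sorted(cls.__name__ for cls in element.__subclasses__())


# An instance of the model is the data structure a generated parser would build.
program = Assignment("x", Sum([IntegerLiteral(1), IntegerLiteral(2)]))
print(alternatives(Expression))
```

Note that nothing in these classes fixes a concrete syntax: delimiters, separators, and token patterns are left to the ASM-CSM mapping.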



III. MODELCC DOMAIN-SPECIFIC LANGUAGE FOR 
ASM-CSM MAPPINGS 

In ModelCC, the constraints imposed over ASMs to de- 
fine a particular ASM-CSM mapping can be declared 
as metadata annotations on the model itself. Now
supported by all the major programming platforms,
metadata annotations are often used in reflective
programming and code generation. Table I summarizes the
set of constraints supported by ModelCC for establishing 
ASM-CSM mappings between ASMs and their concrete 
representation in textual CSMs. 
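
The idea of declaring constraints as metadata on the model itself can be sketched in Python with dataclass field metadata. This is only an analogy under stated assumptions: it is not the ModelCC annotation set, and the constraint keys `pattern` and `separator` as well as the helper `collect_constraints` are hypothetical names chosen to echo the constraints discussed in this paper.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class Identifier:
    # Pattern constraint: regular expression matched by this basic element.
    name: str = field(metadata={"pattern": r"[a-zA-Z][a-zA-Z0-9_]*"})


@dataclass
class Element:
    # Separator constraint: delimiter between consecutive list items.
    name: List[Identifier] = field(metadata={"separator": r"\."})


def collect_constraints(cls: type) -> Dict[str, Dict[str, str]]:
    """Read, by reflection, the constraints attached to each member of a model class."""
    return {f.name: dict(f.metadata) for f in fields(cls) if f.metadata}


print(collect_constraints(Element))
```

Because the constraints live inside the class definitions, this style ties the model to exactly one concrete syntax.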

However, in order to allow the developer to specify 
several ASM-CSM mappings, ModelCC also allows the 
specification of separate sets of constraints by using the 
ModelCC DSL for ASM-CSM mappings. 

Using the ModelCC DSL for ASM-CSM mappings 
instead of metadata annotations to specify ASM-CSM 
mappings allows the specification of several ASM-CSM 
mappings for the same ASM by means of separate con- 
straint specification files. This ultimately allows for the 
proper model-based decoupling of language specification 
and language processing. 
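
A minimal sketch of this decoupling, assuming a toy ASM with a single list-valued member, is shown below. The two mapping dictionaries stand in for two separate constraint specification files. Their keys imitate the property-like notation of the DSL, but the parsing function and the `::` separator are hypothetical and are not part of ModelCC.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Element:
    name: List[str]  # abstract syntax: just a list of identifiers


# Two hypothetical ASM-CSM mappings for the same ASM.
DOTTED = {"Element.name[separator]": r"\."}
SCOPED = {"Element.name[separator]": "::"}


def parse_element(text: str, mapping: Dict[str, str]) -> Element:
    """Split the input on the separator pattern declared by the chosen mapping."""
    separator = mapping["Element.name[separator]"]
    return Element(re.split(separator, text))


# Different concrete syntaxes, same abstract syntax instance.
assert parse_element("a.b.c", DOTTED) == parse_element("a::b::c", SCOPED)
```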

In this section, we describe the ModelCC DSL for 
ASM-CSM mappings. We provide the ModelCC imple- 
mentation of that DSL as an ASM complemented with 
metadata annotations. As an example of the usage of 
this language, we also provide the ModelCC implemen- 
tation of the DSL for ASM-CSM mappings as an ASM 
complemented with constraint specification files written 
in the DSL for ASM-CSM mappings itself.

Subsection III.A outlines the features of the ModelCC DSL
for ASM-CSM mappings. Subsection III.B provides the
definition of the ModelCC DSL for ASM-CSM mappings
as an ASM complemented with metadata annotations.
Subsection III.C provides the definition of the ModelCC
DSL for ASM-CSM mappings as an ASM complemented
with several equivalent constraint specification files.

Table I: Summary of the basic metadata annotations supported by ModelCC.

A. Language Features 

The ModelCC DSL for ASM-CSM mappings supports 
the following features: 

• The definition of constraints on patterns, delim- 
iters, evaluation order, and references between lan- 
guage elements. 

• The property-like specification of constraints for 
language elements and language element members. 

• The grammar-like specification of the concrete syn- 
tax of language elements by means of a regular- 
expression-like language. 

While the semantics of property-like constraint defini- 
tions is equivalent to that of metadata annotation con- 
straint definitions, grammar-like constraint specification 
allows for a more intuitive specification of ASM-CSM 
mappings. 
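
To make the shape of a property-like constraint definition concrete, the following sketch reads a single line of the form `Element.member[constraint]: "value"`. It is a simplification written for illustration, not the ModelCC parser: it only accepts quoted values, and the names `Constraint` and `parse_property_line` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

LINE = re.compile(
    r'^(?P<element>\w+)'          # language element
    r'(?:\.(?P<member>\w+))?'     # optional language element member
    r'(?:\[(?P<kind>\w+)\])?'     # optional constraint identifier
    r'\s*:\s*"(?P<value>.*)"$'    # quoted constraint value
)


class Constraint(NamedTuple):
    element: str
    member: Optional[str]
    kind: Optional[str]
    value: str


def parse_property_line(line: str) -> Constraint:
    """Turn one property-like line into its target and its value."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a property-like constraint: {line!r}")
    return Constraint(**match.groupdict())


print(parse_property_line('Element.name[separator]: "."'))
```

Each parsed line identifies its target the same way a metadata annotation does (an element, optionally one of its members, and a constraint kind), which is why the two styles share the same semantics.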

Grammar-like constraint definitions may be more in- 
tuitive to traditional language designers who are famil- 
iar with syntax-driven language specification tools. Such 
constraint definitions can be redundant with the ASM 
as, for example, they can also include multiplicity
constraints. ModelCC checks and reports whether any syntax
implicit in grammar-like constraint definitions conflicts with
the language ASM.
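
One way such a consistency check could work is sketched below for the optionality operator only. The sketch assumes a non-nested rule body, quoted delimiters without embedded quotes, and a hypothetical summary of the ASM (`ASM_OPTIONAL`) stating which members are optional. It is not how ModelCC implements the check.

```python
import re
from typing import Dict, List

# Assumed ASM summary (hypothetical): member name -> is it optional in the ASM?
ASM_OPTIONAL = {"target": False, "constraintID": True, "constraint": True}


def optional_members(rule: str) -> Dict[str, bool]:
    """Map each member named in a grammar-like rule body to its optionality."""
    body = re.sub(r'"[^"]*"', " ", rule)  # drop quoted delimiters
    result: Dict[str, bool] = {}
    # First alternative: a parenthesized group; second alternative: a bare member.
    pattern = r"\(([^()]*)\)(\??)|([A-Za-z_]\w*)(\??)"
    for group, group_mark, bare, bare_mark in re.findall(pattern, body):
        if bare:
            result[bare] = bare_mark == "?"
        else:
            for member in re.findall(r"[A-Za-z_]\w*", group):
                result[member] = group_mark == "?"
    return result


def conflicts(rule: str, asm_optional: Dict[str, bool]) -> List[str]:
    """List the members whose optionality in the rule disagrees with the ASM."""
    declared = optional_members(rule)
    return sorted(
        member
        for member, optional in declared.items()
        if member in asm_optional and asm_optional[member] != optional
    )


print(conflicts('target ("[" constraintID "]") (":" constraint)?', ASM_OPTIONAL))
```

In the usage line, the `?` after the first group has been removed on purpose, so the rule makes `constraintID` mandatory while the assumed ASM declares it optional.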

Finally, ModelCC checks, reports, and ignores any
constraints on language elements or language element
members that do not exist.
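
The check-report-ignore behavior can be sketched as a filter over parsed constraints. The ASM summary, the target representation, and the warning texts below are all hypothetical, and the sketch only illustrates the policy described above rather than ModelCC's actual diagnostics.

```python
from typing import Dict, List, Set, Tuple

# Assumed ASM summary (hypothetical): element name -> names of its members.
ASM: Dict[str, Set[str]] = {
    "Element": {"name"},
    "Identifier": {"name"},
}

# (element, member); member is "" for constraints on the element itself.
Target = Tuple[str, str]


def filter_constraints(
    constraints: Dict[Target, str],
) -> Tuple[Dict[Target, str], List[str]]:
    """Keep constraints whose target exists in the ASM; report the rest."""
    kept: Dict[Target, str] = {}
    warnings: List[str] = []
    for (element, member), value in constraints.items():
        if element not in ASM:
            warnings.append(f"unknown language element: {element}")
        elif member and member not in ASM[element]:
            warnings.append(f"unknown member: {element}.{member}")
        else:
            kept[(element, member)] = value
    return kept, warnings


kept, warnings = filter_constraints({
    ("Element", "name"): r"\.",
    ("Element", "alias"): ",",
    ("Statement", ""): ";",
})
print(warnings)
```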



It should be noted that all the features of the
ModelCC DSL for ASM-CSM mappings make ModelCC a
complete model-based language workbench.



B. ModelCC Definition of the DSL for ASM-CSM
mappings

The ASM of the ModelCC DSL for ASM-CSM mappings 
is designed first, and then it is mapped to a CSM by 
imposing constraints by means of metadata annotations 
on the model classes. 

The resulting model can be processed by ModelCC to 
generate the corresponding parser. The ModelCC
language model (depicted as a UML class diagram) in
Figure 3 presents the ModelCC DSL for ASM-CSM
mappings.

This figure highlights two of the reasons why a DSL
for ASM-CSM mappings is needed:

• When metadata annotations are used to define
ASM-CSM mappings on top of the ASM, the concrete
syntax is interleaved with the abstract syntax
model in a way that burdens it, similar to the way language
processing is coupled with language specification
in traditional syntax-driven language specification
techniques.

• Also, there is no intuitive way to allow the specifica- 
tion of multiple ASM-CSM mappings using meta- 
data annotations. 



C. Separating the ASM from the CSM

Once an initial implementation of the ModelCC DSL for
ASM-CSM mappings provides a bootstrap, we provide
implementations of the ModelCC DSL for ASM-CSM
mappings that consist of an ASM and separate constraint
definitions using that language.

Figure 3: Definition of the ModelCC DSL for ASM-CSM mappings in ModelCC.

Figure 4: Definition of the abstract syntax model of the ModelCC DSL for ASM-CSM mappings in ModelCC.

The ModelCC language model (depicted as a UML
class diagram) in Figure 4 presents the ASM of the
ModelCC DSL for ASM-CSM mappings.

Starting from this ASM, we provide three different 
ASM-CSM mappings for the language, all of them writ- 
ten in the ModelCC DSL for ASM-CSM mappings itself. 
By complementing the ASM with any of the three following
ASM-CSM mappings, or any other equivalent one, we
obtain the same language as in the previous subsection.

• Property-like specification. Figure 5 presents a
property-like ASM-CSM mapping written in the
ModelCC DSL for ASM-CSM mappings.

The property-like specification of ASM-CSM mappings
mimics the specification of constraints on
ASMs using metadata annotations. It can be observed
that the constraints are specified as properties
of language elements.

• Grammar-like specification. Figure 6 presents a
grammar-like ASM-CSM mapping written in the
ModelCC DSL for ASM-CSM mappings.

Some of the advantages of grammar-like mappings
can be observed in the specification of the
ConstraintDefinition language element constraints. A
single constraint specification can include prefix
constraints, suffix constraints, and language element
member order constraints. Also, the specification
of the ConstraintDefinition language element
constraints includes two multiplicity constraints
(optionality, represented by the regex-like
"?" operator) that are redundant with the ASM.
ModelCC will check these multiplicity constraints
for consistency with the ASM, and will report any
conflict at parser generation time.

Another illustrative case of grammar-like mappings
can be observed in the specification of the Element
language element constraints. Although its
member name is defined as a list in the ASM, the
grammar-like constraint specification uses a classical
explicit-list specification to specify the separator
for list members.

ConstraintDefinition.constraintID[prefix]: "\["
ConstraintDefinition.constraintID[suffix]: "\]"
ConstraintDefinition.constraint[prefix]: ":"
Element.name[separator]: "."
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification[suffix]: "\*"
OptionalSpecification[suffix]: "\?"
PositiveClauseSpecification[prefix]: "\+"
ParenthesizedSpecification[prefix]: "\("
ParenthesizedSpecification[suffix]: "\)"
SequenceSpecification[precedes]: AlternationSpecification
                                 PrecedenceSpecification
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
AlternationSpecification.constraints[separator]: "\|"
PrecedenceSpecification[precedes]: AlternationSpecification
PrecedenceSpecification.constraints[separator]: "\<"
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 5: Property-like specification of the mapping from the abstract syntax model to the concrete syntax model of
ModelCC DSL for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.

ConstraintDefinition: target ("[" constraintID "]")? (":" constraint)?
Element: name ("." name)*
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification: constraint "\*"
OptionalSpecification: constraint "\?"
PositiveClauseSpecification: constraint "\+"
ParenthesizedSpecification: "\(" constraint "\)"
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
                         < AlternationSpecification
AlternationSpecification: constraints ("\|" constraints)*
PrecedenceSpecification: constraints ("\<" constraints)*
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 6: Grammar-like specification of the mapping from the abstract syntax model to the concrete syntax model of
ModelCC DSL for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.

ConstraintDefinition: "[" constraintID "]"
ConstraintDefinition: ":" constraint
Element.name[separator]: "."
Identifier.name: "[a-zA-Z][a-zA-Z0-9_]*"
ClausureSpecification: constraint "\*"
OptionalSpecification: constraint "\?"
PositiveClauseSpecification: constraint "\+"
ParenthesizedSpecification: "\(" constraint "\)"
ConstraintSpecification: SequenceSpecification < PrecedenceSpecification
                         < AlternationSpecification
AlternationSpecification.constraints[separator]: "\|"
PrecedenceSpecification.constraints[separator]: "\<"
Boolean.value: "true|false"
Integer.value: "[0-9]+"

Figure 7: Mixed specification of the mapping from the abstract syntax model to the concrete syntax model of ModelCC DSL
for ASM-CSM mappings, written in the ModelCC DSL for ASM-CSM mappings itself.

• Mixed specification. Figure 7 presents another
example of an ASM-CSM mapping written in the 
ModelCC DSL for ASM-CSM mappings. In this 
case, some constraints are specified grammar-like 
and some constraints are specified property-like. 

In this case, separators in lists are specified us- 
ing property-like constraint definitions, which may 
seem more intuitive to some language designers. 

It should be noted that constraint definitions dif- 
fer from grammar rules in that several of them 
can be specified for separate members of the same 
language element, as can be observed in the Con- 
straintDefinition language element. 
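
The equivalence between the explicit-list form used in the grammar-like mapping for Element and the separator property used in the property-like and mixed mappings can be sketched with ordinary regular expressions. The function names are hypothetical, and the separator is written as an escaped dot so that it matches a literal "." character; this is an illustration rather than the recognizer ModelCC generates.

```python
import re
from typing import List, Optional

IDENTIFIER = r"[a-zA-Z][a-zA-Z0-9_]*"
SEPARATOR = r"\."


def parse_explicit_list(text: str, item: str, separator: str) -> Optional[List[str]]:
    """Grammar-like form: item (separator item)*"""
    rule = re.compile(f"(?:{item})(?:{separator}(?:{item}))*")
    if rule.fullmatch(text) is None:
        return None
    return re.findall(item, text)


def parse_with_separator(text: str, item: str, separator: str) -> Optional[List[str]]:
    """Property-like form: a list member plus a separator constraint."""
    parts = re.split(separator, text)
    if all(re.fullmatch(item, part) for part in parts):
        return parts
    return None


sample = "ConstraintDefinition.constraintID"
assert parse_explicit_list(sample, IDENTIFIER, SEPARATOR) == \
    parse_with_separator(sample, IDENTIFIER, SEPARATOR)
```

Both functions accept the same inputs and build the same list, which is the sense in which the three mappings describe the same language.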

Finally, it should be noted that ASMs that are com- 
plemented with metadata annotations can be comple- 
mented with files written in the ModelCC DSL for ASM- 
CSM mappings. Such files could redefine constraints that 
are specified in the original annotated ASM. Therefore, 
metadata annotation constraints would represent default 
values that would apply, unless otherwise specified, to all 
the ASM-CSM mappings of a language. 
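
A minimal sketch of this defaulting behavior, assuming that both sources of constraints have already been read into dictionaries keyed by constraint target, is given below. The keys, the values, and the function name are hypothetical.

```python
from typing import Dict

# Constraints taken from metadata annotations on the ASM (hypothetical values).
ANNOTATION_DEFAULTS: Dict[str, str] = {
    "Element.name[separator]": r"\.",
    "Identifier.name": r"[a-zA-Z][a-zA-Z0-9_]*",
}


def effective_mapping(defaults: Dict[str, str], overrides: Dict[str, str]) -> Dict[str, str]:
    """Constraints from a mapping file take priority over annotation defaults."""
    return {**defaults, **overrides}


# A mapping file that redefines only the list separator.
scoped = effective_mapping(ANNOTATION_DEFAULTS, {"Element.name[separator]": "::"})
print(scoped)
```

Constraints that the mapping file does not mention keep the value given by the annotations, so each additional concrete syntax only needs to state its differences.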

IV. CONCLUSIONS AND FUTURE WORK 

ModelCC is a model-based parser generator that allows 
using metadata annotations or a domain-specific lan- 
guage to specify abstract syntax model to concrete syntax 
model mappings. 

In this paper, we have proposed and described the 
ModelCC domain-specific language for abstract syntax 
model to concrete syntax model mappings (ModelCC 
DSL for ASM-CSM mappings). 



The ModelCC DSL for ASM-CSM mappings allows the 
specification of separate abstract syntax model to con- 
crete syntax model mappings and, therefore, effectively 
decouples abstract syntax models from concrete syntax 
models. 

As an example, we have specified the ModelCC DSL for 
ASM-CSM mappings as an ASM and several equivalent 
ASM-CSM mappings written in the ModelCC DSL for 
ASM-CSM mappings language itself. 

In the future, we plan to apply model-based language
specification techniques to problems such as data
integration and natural language processing. We also plan
to incorporate different reference resolution techniques into
ModelCC.